{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python 机器学习实战 ——代码样例\n",
    "\n",
    "# 第七章 迭代器\n",
    "\n",
    "## 7.1. 可迭代对象 (Iterable)\n",
    "\n",
    "凡是可以返回一个迭代器的对象都可称之为可迭代对象，对象中有 iter 或者 getitem 方法。很多容器（list、dict 这些都是容器）都是可迭代对象，此外还有类似打开状态的files，sockets等。\n",
    "\n",
    "可以通过使用内建的工厂函数 iter() 获取迭代器对象，然后就可以使用 next() 方法来取得迭代器内的内容\n",
    "\n",
    "迭代器是访问集合元素的一种方式。迭代器对象从集合的第一个元素开始访问，直到所有的元素被访问完结束。迭代器只能往前不会后退，不过这也没什么，因为人们很少在迭代途中往后退。另外，迭代器的一大优点是不要求事先准备好整个迭代过程中所有的元素。迭代器仅仅在迭代到某个元素时才计算该元素，而在这之前或之后，元素可以不存在或者被销毁。这个特点使得它特别适合用于遍历一些巨大的或是无限的集合，比如几个G的文件\n",
    "\n",
    "可迭代对象的特点：\n",
    "\n",
    "- 访问者不需要关心迭代器内部的结构，仅需通过next()方法不断去取下一个内容\n",
    "- 不能随机访问集合中的某个值 ，只能从头到尾依次访问\n",
    "- 访问到一半时不能往回退\n",
    "- 便于循环比较大的数据集合，节省内存\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "a\n",
      "bb\n"
     ]
    }
   ],
   "source": [
    "# 通过迭代器的方法访问可迭代对象\n",
    "\n",
    "list_test = ['a', 'bb', 'ccc', 'a1']\n",
    "list_iter = iter(list_test)\n",
    "print(next(list_iter))\n",
    "print(next(list_iter))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['__add__',\n",
       " '__class__',\n",
       " '__contains__',\n",
       " '__delattr__',\n",
       " '__delitem__',\n",
       " '__dir__',\n",
       " '__doc__',\n",
       " '__eq__',\n",
       " '__format__',\n",
       " '__ge__',\n",
       " '__getattribute__',\n",
       " '__getitem__',\n",
       " '__gt__',\n",
       " '__hash__',\n",
       " '__iadd__',\n",
       " '__imul__',\n",
       " '__init__',\n",
       " '__iter__',\n",
       " '__le__',\n",
       " '__len__',\n",
       " '__lt__',\n",
       " '__mul__',\n",
       " '__ne__',\n",
       " '__new__',\n",
       " '__reduce__',\n",
       " '__reduce_ex__',\n",
       " '__repr__',\n",
       " '__reversed__',\n",
       " '__rmul__',\n",
       " '__setattr__',\n",
       " '__setitem__',\n",
       " '__sizeof__',\n",
       " '__str__',\n",
       " '__subclasshook__',\n",
       " 'append',\n",
       " 'clear',\n",
       " 'copy',\n",
       " 'count',\n",
       " 'extend',\n",
       " 'index',\n",
       " 'insert',\n",
       " 'pop',\n",
       " 'remove',\n",
       " 'reverse',\n",
       " 'sort']"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 我们列出list 的所有属性和方法，我们可以看到 __getitem__ 方法\n",
    "\n",
    "dir(list)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 检查一个类中有没有某个方法，可以简单的用 hasattr 方法\n",
    "\n",
    "hasattr(list, \"__getitem__\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面简单的例子为了说明迭代，那么迭代到底有什么好处，我们什么时候应该用迭代呢？\n",
    "\n",
    "## 7.2. 迭代器 (Iterator)\n",
    "\n",
    "迭代器实现了工厂模式的对象，它在你每次询问要下一个值的时候返回给你。\n",
    "\n",
    "对于 Python 本来就支持随机访问的数据结构 ( 如 tuple、list )，迭代器和经典 for 循环的索引访问相比并无优势。但对于无法随机访问的数据结构 ( 比如 set ) 而言，迭代器是唯一的访问元素的方式。\n",
    "\n",
    "迭代器的优点就是它不要求你事先准备好整个迭代过程中所有的元素。迭代器仅仅在迭代至某个元素时才计算该元素，而在这之前或之后，元素可以不存在或者被销毁。这个特点使得它特别适合用于遍历一些巨大的或是无限的集合，比如几个 G 的文件，或是斐波那契数列等等。这个特点被称为延迟计算或惰性求值( Lazy evaluation )。\n",
    "\n",
    "迭代器是一个有 next 方法的对象，调用 next 方法可以返回对象中的下一个值，任何实现了 iter 和 next 方法的对象都是迭代器，iter 返回迭代器自身，next 返回容器中的下一个值。\n",
    "\n",
    "下面，我们写一个具有迭代功能的类。用户设定 start 和 end 的值，然后用 for 遍历的方式来返回其中的奇数。可以理解为这个 class 能返回所有的奇数，只要你需要。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "It is my iterator\n",
      "1\n",
      "3\n",
      "5\n",
      "7\n"
     ]
    }
   ],
   "source": [
    "class MyIter01:\n",
    "    def __init__(self, start = 0, end = 0):\n",
    "        self.start = start\n",
    "        self.end = end\n",
    "        \n",
    "    def __iter__(self):\n",
    "        print('It is my iterator')\n",
    "        return self\n",
    "    \n",
    "    def __next__(self):\n",
    "        i = (self.start - 1) *2 + 1\n",
    "        if i < self.end:\n",
    "            self.start += 1\n",
    "            return i\n",
    "        else:\n",
    "            raise StopIteration()\n",
    "\n",
    "iter_test = MyIter01(1, 8)\n",
    "for i in iter_test:\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "3\n",
      "5\n",
      "7\n"
     ]
    }
   ],
   "source": [
    "# 用 next 方法来访问迭代器\n",
    "\n",
    "iter_test = MyIter01(1, 8)\n",
    "print(next(iter_test))\n",
    "print(next(iter_test))\n",
    "print(next(iter_test))\n",
    "print(next(iter_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过这个简单的例子可以看到迭代器的强大功能。在这个例子基础上扩展，可以完成更加复杂的业务逻辑，而执行这些迭代器，就和使用 Python 自带的 list 一样简单。通过迭代器写一个斐波那契数列的例子比较多，就不赘述了。\n",
    "\n",
    "直接输出迭代器的内容是不行，需要通过 list() 转换。我们来看下面这个例子。首先，生成一个迭代器：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<__main__.MyIter01 object at 0x000001D29240B668>\n",
      "It is my iterator\n",
      "[1, 3, 5, 7]\n",
      "It is my iterator\n",
      "[]\n"
     ]
    }
   ],
   "source": [
    "# 生成一个迭代器\n",
    "iter_test = MyIter01(1, 8)\n",
    "\n",
    "# 显示迭代器本身\n",
    "print(iter_test)\n",
    "\n",
    "# 第一次显示列表化的迭代器\n",
    "print(list(iter_test))\n",
    "\n",
    "# 第二次再显示一次列表化的迭代器\n",
    "print(list(iter_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "由于迭代器的特性，所以第一次输出没问题，但是第二次输出的时候，就是一个空的 list 了，因为所有的值已经迭代完了。当然我们可以通过重新初始化 iter_test 来实现，但在实际中会有风险，最简单的办法，还是预先定义一个转换好的 list 比较简单实用，也就是需要全量的迭代器内容，就用 list() 预先转换一个，其他需要当做迭代器使用的时候，还是用 for 语句或者 next()。\n",
    "\n",
    "从编程的角度来说，很多时候并不需要有所谓“完美的完整实现”，而是不同的数据结构、模块、算法等各司其事，在特定的环境完成需要的任务，因为这其实意味着效率，一段什么都能做的代码，肯定是非常复杂的，至少在效率上就会有所损失。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n",
      "2\n",
      "4\n",
      "6\n",
      "8\n"
     ]
    }
   ],
   "source": [
    "# 看这段代码，想一下执行结果是什么\n",
    "\n",
    "a = iter(list(range(10)))\n",
    "for i in a:\n",
    "    print(i)\n",
    "    next(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "next() 方法将迭代的任务完成了，因此输出的时候，就是一个偶数序列了。注意，如果不把 list 进行 iter() 的话是不能这样使用的，程序会报错，如下所示。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0\n"
     ]
    },
    {
     "ename": "TypeError",
     "evalue": "'list' object is not an iterator",
     "output_type": "error",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-8-78d92a399278>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m      2\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[1;32min\u001b[0m \u001b[0ma\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      3\u001b[0m     \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mi\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 4\u001b[1;33m     \u001b[0mnext\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0ma\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[1;31mTypeError\u001b[0m: 'list' object is not an iterator"
     ]
    }
   ],
   "source": [
    "a = list(range(10))\n",
    "for i in a:\n",
    "    print(i)\n",
    "    next(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7.3.\tItertools 模块\n",
    "\n",
    "Python 内置了一个模块 itertools，包含了很多函数用于创建更有效率的循环迭代器，比如 count()。\n",
    "\n",
    "### 7.3.1. count()函数\n",
    "\n",
    "count(start, [step]) 从 start 开始，以后每个元素都加上 step，step 默认值为 1。\n",
    "\n",
    "count()可以被理解为输出一个无限的序列，如果我们要输出一个指定范围内的奇数序列的话，可以使用 islice() 函数 ( 可迭代序列的 slice )。\n",
    "\n",
    "count() 语法：itertools.count(start=0, step=1)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 3, 4]\n",
      "[1, 3, 5, 7]\n"
     ]
    }
   ],
   "source": [
    "from itertools import islice, count\n",
    "\n",
    "list_01 = [i for i in islice(count(), 5)]\n",
    "print(list_01)\n",
    "\n",
    "list_02 = [i for i in islice(count(), 1, 9, 2)]\n",
    "print(list_02)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.2.\tcycle()函数\n",
    "\n",
    "cycle() 语法：itertools.cycle(iterable)，迭代至序列 p 最后一个元素后，再从序列 p 的第一个元素重新开始。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 3, 0, 1, 2, 3, 0, 1]\n"
     ]
    }
   ],
   "source": [
    "from itertools import *\n",
    "list_01 = [i for i in islice(cycle(range(4)), 10)]\n",
    "print(list_01)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.3.\trepeat()函数\n",
    "\n",
    "repeat() 语法：itertools.repeat(object[, times])\n",
    "将 object 重复 times 次，如果不指定，则无限重复。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['a', 'a', 'a']\n"
     ]
    }
   ],
   "source": [
    "from itertools import *\n",
    "\n",
    "list_01 = [i for i in (repeat('a', 3))]\n",
    "print(list_01)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.4.\taccumulate()函数\n",
    "\n",
    "accumulate(iterable[, func])\n",
    "\n",
    "accumulate() 可以将一个函数连续地作用在可迭代序列上，和 reduce() 函数一样，先对第一个和第二个元素计算，再将结果和第三个元素进行计算，以此类推。\n",
    "\n",
    "accumulate() 函数和 reduce() 函数是有点相似的，reduce() 函数是对可迭代对象进行收敛，最后只输出一个元素，accumulate() 既然作为迭代器的函数，它输出的元素和输入是等长的。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 3, 6, 10, 15]\n",
      "[1, 6, 12, 19, 27]\n"
     ]
    }
   ],
   "source": [
    "# 演示 itertools 的 accumulate() 函数\n",
    "from itertools import *\n",
    "\n",
    "# 默认的 func 是一个 add\n",
    "list_01 = list(accumulate([1,2,3,4,5]))\n",
    "print(list_01)\n",
    "\n",
    "# 我们定义一个自己的计算函数\n",
    "def my_func(a,b):\n",
    "    return a + b + 3\n",
    "\n",
    "list_02 = list(accumulate([1,2,3,4,5], my_func))\n",
    "print(list_02)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 3, 0, 1, 2, 3, 0, 1]\n",
      "[0, 1, 2, 3, 3, 3, 3, 3, 3, 3]\n"
     ]
    }
   ],
   "source": [
    "# 演示 itertools 的 accumulate() 函数\n",
    "from itertools import accumulate, islice, cycle\n",
    "\n",
    "# 生成一个可迭代对象\n",
    "list_01 = list(islice(cycle(range(4)), 10))\n",
    "print(list_01)\n",
    "\n",
    "# 对于 accumulate 引入 max() 函数\n",
    "list_02 = list(accumulate(list_01, max))\n",
    "print(list_02)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.5.\tchain()函数\n",
    "\n",
    "chain() 迭代器，它能够将多个可迭代对象合并成一个更长的可迭代对象。我们可以看看下面两个小例子："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 2, 3, 'a', 'b', 'c']\n"
     ]
    }
   ],
   "source": [
    "from itertools import chain\n",
    "print(list(chain([1, 2, 3], ['a', 'b', 'c'])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\n",
      "2\n",
      "3\n",
      "a\n",
      "b\n",
      "c\n"
     ]
    }
   ],
   "source": [
    "from itertools import chain\n",
    "\n",
    "i0 = chain([1, 2, 3], ['a', 'b', 'c'])\n",
    "\n",
    "for i in i0:\n",
    "    print(i)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.6.\tcompress()函数\n",
    "\n",
    "compress() 这个迭代器也很有趣，它会交替检查两个可迭代对象，如果第二个的是 True，则保留第一个中相应的元素。在很多场景下这是非常有用的。举例如下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['A', 'D', 'E', 'G']\n"
     ]
    }
   ],
   "source": [
    "from itertools import compress\n",
    "\n",
    "letters = 'ABCDEFG'\n",
    "bools = [1, 0, 0, 1, 1, 0, 1]\n",
    "\n",
    "i0 = compress(letters, bools)\n",
    "print(list(i0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.7.\tdropwhile()函数\n",
    "\n",
    "dropwhile() 需要输入两个参数：predicate 和 iterable，一个断言和一个可迭代对象，然后会根据这个断言应用在可迭代对象上，当不符合断言时，输出后面所有的迭代结果。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[10, 30, 2, 4]\n"
     ]
    }
   ],
   "source": [
    "from itertools import dropwhile\n",
    "\n",
    "print(list(dropwhile(lambda x: x < 10, [1, 5, 7, 10, 30, 2, 4])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10\n",
      "30\n",
      "2\n",
      "4\n"
     ]
    }
   ],
   "source": [
    "from itertools import dropwhile\n",
    "\n",
    "i0 = dropwhile(lambda x: x < 10, [1, 5, 7, 10, 30, 2, 4])\n",
    "    \n",
    "print(next(i0))\n",
    "print(next(i0))\n",
    "print(next(i0))\n",
    "print(next(i0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.8.\tfilterfalse()函数\n",
    "\n",
    "filterfalse() 是一个过滤器，如果断言为 false 的话，输出可迭代对象的内容。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 5, 7, 2, 4]\n"
     ]
    }
   ],
   "source": [
    "from itertools import filterfalse\n",
    "\n",
    "def greater_than_ten(x):\n",
    "    return x >= 10 \n",
    "\n",
    "print(list(filterfalse(greater_than_ten, [1, 5, 7, 10, 30, 2, 4])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "filterfalse() 是一个过滤器，如果断言为 false 的话，输出可迭代对象的内容。我们可以用 lambda 简化如下，在迭代序列中不满足断言的都被输出了\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 5, 7, 2, 4]\n"
     ]
    }
   ],
   "source": [
    "from itertools import filterfalse\n",
    "\n",
    "print(list(filterfalse(lambda x:x>=10, [1, 5, 7, 10, 30, 2, 4])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "下面的代码和之前的 dropwhile()比较一下。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[10, 30]\n",
      "[10, 30, 2, 4]\n"
     ]
    }
   ],
   "source": [
    "from itertools import filterfalse,dropwhile\n",
    "\n",
    "print(list(filterfalse(lambda x:x<10, [1, 5, 7, 10, 30, 2, 4])))\n",
    "print(list(dropwhile(lambda x: x < 10, [1, 5, 7, 10, 30, 2, 4])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.9.\tgroupby()函数\n",
    "\n",
    "itertools 中的 groupby() 函数非常酷，就像 SQL 中的 groupby 一样，可以将元素按照指定的 key 来归类。\n",
    "\n",
    "语法：itertools.groupby(iterable, key=None)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('Apple', <itertools._grouper object at 0x000001D292435A90>), ('Huawei', <itertools._grouper object at 0x000001D292435B38>), ('Microsoft', <itertools._grouper object at 0x000001D292435B70>)]\n"
     ]
    }
   ],
   "source": [
    "from itertools import groupby\n",
    " \n",
    "list_0 = sorted([('Apple', 'iPhone'), ('Apple', 'iMac'),('Microsoft', 'Surface'), ('Apple', 'iPad'),\n",
    "          ('Huawei', 'P9'), ('Huawei', 'P10')])\n",
    "\n",
    "print(list(groupby(list_0, lambda make: make[0])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "对一个排序后的 list 进行 gruopby 操作，可以看到，在输出的 key 结果里，每个元组的第二个元素都是一个可迭代对象，我们可以用循环来获得这个可迭代对象的值。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "iMac is made by Apple\n",
      "iPad is made by Apple\n",
      "iPhone is made by Apple\n",
      "\n",
      "P10 is made by Huawei\n",
      "P9 is made by Huawei\n",
      "\n",
      "Surface is made by Microsoft\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from itertools import groupby\n",
    " \n",
    "list_0 = sorted([('Apple', 'iPhone'), ('Apple', 'iMac'),('Microsoft', 'Surface'), ('Apple', 'iPad'),\n",
    "          ('Huawei', 'P9'), ('Huawei', 'P10')])\n",
    "\n",
    "for key, group in groupby(list_0, lambda make: make[0]):\n",
    "    for make, item in group:\n",
    "        print('{item} is made by {make}'.format(item=item, make=make))\n",
    "    print ('')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.10.\tstarmap()函数\n",
    "\n",
    "语法：itertools.starmap(function, iterable)\n",
    "\n",
    "startmap() 对于一个可迭代对象进行指定函数的计算，先来看看下面的例子：\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(1, 4, 4), (2, 4, 8), (3, 4, 12), (4, 4, 16)]\n"
     ]
    }
   ],
   "source": [
    "from itertools import starmap\n",
    "\n",
    "print(list(starmap(lambda x,y: (x, y, x * y), [(1, 4), (2, 4), (3, 4), (4, 4)])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[(1, 4, 4), (2, 4, 8), (3, 4, 12), (4, 4, 16)]\n"
     ]
    }
   ],
   "source": [
    "# 也可以这样写：\n",
    "def func(t):\n",
    "    return t[0], t[1], t[0]*t[1]\n",
    "\n",
    "print(list(map(func, [(1, 4), (2, 4), (3, 4), (4, 4)])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.11.\ttakewhile()函数\n",
    "\n",
    "takewhile() 和 dropwhile() 相反，它会返回符合条件的元素，但是一旦有元素不符合条件，就停止了。可以和 dropwhile() 比较一下。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 5, 7]\n"
     ]
    }
   ],
   "source": [
    "from itertools import takewhile\n",
    "\n",
    "print(list(takewhile(lambda x: x < 10, [1, 5, 7, 10, 30, 2, 4])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[10, 30, 2, 4]\n"
     ]
    }
   ],
   "source": [
    "from itertools import dropwhile\n",
    "\n",
    "print(list(dropwhile(lambda x: x < 10, [1, 5, 7, 10, 30, 2, 4])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.12.\ttee()函数\n",
    "\n",
    "语法：itertools.tee(iterable, n=2)\n",
    "\n",
    "tee() 比较简单，从一个迭代器生成几个相同的迭代器。前面我们讨论过，由于迭代器的特殊性，当你访问过一个迭代器后，数据就没了。用 list 转换并不是一个很好的办法，因为使用迭代器本身就是为了节约内存，如果用 list 复制了，这个特点也就没有了。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n",
      "[]\n",
      "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n",
      "[]\n"
     ]
    }
   ],
   "source": [
    "from itertools import tee\n",
    "\n",
    "i1, i2 = tee(range(10), 2)\n",
    "print(list(i1))\n",
    "print(list(i1))\n",
    "print(list(i2))\n",
    "print(list(i2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 7.3.13.\tzip_longest()函数\n",
    "\n",
    "语法：itertools.zip_longest(*iterables, fillvalue=None)\n",
    "\n",
    "还记得前面强调的 zip()吧，这个迭代函数 zip_longest()也是用来将多个可迭代对象 zip 起来，然后如果长短不一，则以 fillvalue 为填充值，结果还是一个可以迭代的序列。\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "('A', '1')\n",
      "('B', '2')\n",
      "('C', 'blank')\n",
      "('D', 'blank')\n"
     ]
    }
   ],
   "source": [
    "from itertools import zip_longest\n",
    "\n",
    "for item in zip_longest('ABCD', '12', fillvalue='blank'):\n",
    "    print(item)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
